1.
J Chem Inf Model ; 64(3): 690-696, 2024 Feb 12.
Article in English | MEDLINE | ID: mdl-38230885

ABSTRACT

The Kováts retention index (RI) is a quantity measured using gas chromatography and is commonly used in the identification of chemical structures. Creating libraries of observed RI values is a laborious task, so we explore the use of a deep neural network for predicting RI values from structure for standard semipolar columns. This network generated predictions with a mean absolute error of 15.1 and, in a quantification of the tail of the error distribution, a 95th percentile absolute error of 46.5. Because of the Artificial Intelligence Retention Indices (AIRI) network's accuracy, it was used to predict RI values for the NIST EI-MS spectral libraries. These RI values are used to improve chemical identification methods and the quality of the library. Estimating uncertainty is an important practical need when using prediction models. To quantify the uncertainty of our network for each individual prediction, we used the outputs of an ensemble of 8 networks to calculate a predicted standard deviation for each RI value prediction. This predicted standard deviation was corrected to follow the error between the observed and predicted RI values. The Z scores using these predicted standard deviations had a standard deviation of 1.52 and a 95th percentile absolute Z score corresponding to a mean RI value of 42.6.


Subject(s)
Artificial Intelligence , Neural Networks, Computer , Uncertainty
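
The per-prediction uncertainty described in this abstract can be illustrated with a minimal sketch. It assumes the ensemble prediction is the mean of the member outputs and the predicted standard deviation is their sample standard deviation. The eight RI values and the observed value are invented for illustration, and the correction step mentioned in the abstract is left out because its form is not given there.

```python
import statistics


def ensemble_prediction(member_predictions):
    """Return (mean, sample standard deviation) across ensemble members."""
    mean = statistics.fmean(member_predictions)
    spread = statistics.stdev(member_predictions)
    return mean, spread


def z_score(observed, predicted, predicted_sd):
    """Signed error in units of the predicted standard deviation."""
    return (observed - predicted) / predicted_sd


# Hypothetical RI predictions from an ensemble of 8 networks for one compound.
members = [1202.0, 1198.0, 1205.0, 1200.0, 1196.0, 1204.0, 1201.0, 1194.0]
mean_ri, sd_ri = ensemble_prediction(members)

# Hypothetical observed RI value for the same compound.
z = z_score(observed=1210.0, predicted=mean_ri, predicted_sd=sd_ri)
print(f"predicted RI = {mean_ri:.1f} +/- {sd_ri:.2f}, Z = {z:.2f}")
```

If the predicted standard deviations were perfectly calibrated, Z scores collected over many compounds would have a standard deviation close to 1.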
2.
J Proteome Res ; 22(7): 2246-2255, 2023 07 07.
Article in English | MEDLINE | ID: mdl-37232537

ABSTRACT

The unbounded permutations of biological molecules, including proteins and their constituent peptides, present a dilemma in identifying the components of complex biosamples. Sequence search algorithms used to identify peptide spectra can be expanded to cover larger classes of molecules, including more modifications, isoforms, and atypical cleavage, but at the cost of false positives or false negatives due to the simplified spectra they compute from sequence records. Spectral library searching can help solve this issue by precisely matching experimental spectra to library spectra with excellent sensitivity and specificity. However, compiling spectral libraries that span entire proteomes is pragmatically difficult. Neural networks that predict complete spectra containing a full range of annotated and unannotated ions can be used to replace these simplified spectra with libraries of fully predicted spectra, including modified peptides. Using such a network, we created predicted spectral libraries that were used to rescore matches from a sequence search done over a large search space, including a large number of modifications. Rescoring improved the separation of true and false hits by 82%, yielding an 8% increase in peptide identifications, including a 21% increase in nonspecifically cleaved peptides and a 17% increase in phosphopeptides.


Subject(s)
Peptide Library , Proteome , Proteome/metabolism , Artificial Intelligence , Tandem Mass Spectrometry , Algorithms , Phosphopeptides , Databases, Protein , Software
4.
Article in English | MEDLINE | ID: mdl-31427293

ABSTRACT

Antimicrobial resistance (AMR) is a major public health problem that requires publicly available tools for rapid analysis. To identify AMR genes in whole-genome sequences, the National Center for Biotechnology Information (NCBI) has produced AMRFinder, a tool that identifies AMR genes using a high-quality curated AMR gene reference database. The Bacterial Antimicrobial Resistance Reference Gene Database consists of up-to-date gene nomenclature, a set of hidden Markov models (HMMs), and a curated protein family hierarchy. Currently, it contains 4,579 antimicrobial resistance proteins and more than 560 HMMs. Here, we describe AMRFinder and its associated database. To assess the predictive ability of AMRFinder, we measured the consistency between predicted AMR genotypes from AMRFinder and resistance phenotypes of 6,242 isolates from the National Antimicrobial Resistance Monitoring System (NARMS). This included 5,425 Salmonella enterica, 770 Campylobacter spp., and 47 Escherichia coli isolates phenotypically tested against various antimicrobial agents. Of 87,679 susceptibility tests performed, 98.4% were consistent with predictions. To assess the accuracy of AMRFinder, we compared its gene symbol output with that of a 2017 version of ResFinder, another publicly available resistance gene detection system. Most gene calls were identical, but there were 1,229 gene symbol differences (8.8%) between them, with differences due to both algorithmic differences and database composition. AMRFinder missed 16 loci that ResFinder found, while ResFinder missed 216 loci that AMRFinder identified. Based on these results, AMRFinder appears to be a highly accurate AMR gene detection system.
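
The genotype-phenotype consistency figure reported in this abstract is a simple agreement rate over susceptibility tests. The sketch below shows that calculation on invented toy data. The function name and the pair layout are assumptions for illustration and are not part of AMRFinder.

```python
def consistency_rate(tests):
    """Fraction of tests where the genotype-based prediction matches the phenotype.

    Each test is a (predicted_resistant, observed_resistant) pair of booleans.
    """
    if not tests:
        raise ValueError("at least one susceptibility test is required")
    agree = sum(1 for predicted, observed in tests if predicted == observed)
    return agree / len(tests)


# Hypothetical toy data: five isolate/drug tests, one of them discordant.
tests = [(True, True), (False, False), (True, False), (False, False), (True, True)]
rate = consistency_rate(tests)
print(f"{rate:.1%} of tests consistent with predictions")
```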

5.
Proteomics ; 13(10-11): 1692-5, 2013 May.
Article in English | MEDLINE | ID: mdl-23533138

ABSTRACT

The PRIDE database, developed and maintained at the European Bioinformatics Institute (EBI), is one of the most prominent data repositories dedicated to high throughput MS-based proteomics data. Peptidome, developed by the National Center for Biotechnology Information (NCBI) as a sibling resource to PRIDE, was discontinued due to funding constraints in April 2011. A joint effort between the two teams was started soon after the Peptidome closure to ensure that data were not "lost" to the wider proteomics community by exporting it to PRIDE. As a result, data in the low terabyte range have been migrated from Peptidome to PRIDE and made publicly available under experiment accessions 17 900-18 271, representing 54 projects, ~53 million mass spectra, ~10 million peptide identifications, ~650,000 protein identifications, ~1.1 million biologically relevant protein modifications, and 28 species, from more than 30 different labs.


Subject(s)
Databases, Protein , Proteome/chemistry , Information Storage and Retrieval , Molecular Sequence Annotation , Proteomics , Tandem Mass Spectrometry
6.
Proteomics ; 10(16): 3035-9, 2010 Aug.
Article in English | MEDLINE | ID: mdl-20564260

ABSTRACT

We present MassSieve, a Java-based platform for visualization and parsimony analysis of single and comparative LC-MS/MS database search engine results. The success of mass spectrometric peptide sequence assignment algorithms has led to the need for a tool to merge and evaluate the increasing data set sizes that result from LC-MS/MS-based shotgun proteomic experiments. MassSieve supports reports from multiple search engines with differing search characteristics, which can increase peptide sequence coverage and/or identify conflicting or ambiguous spectral assignments.


Subject(s)
Computational Biology/methods , Data Mining/methods , Peptide Mapping/methods , Software , Tandem Mass Spectrometry/methods , Algorithms , Peptide Fragments/chemistry , Statistics, Nonparametric , User-Computer Interface
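
Parsimony analysis in shotgun proteomics usually means finding a small set of proteins that accounts for all identified peptides. The abstract does not say how MassSieve does this, so the sketch below substitutes a generic greedy set-cover heuristic. The protein and peptide names are hypothetical.

```python
def greedy_parsimony(protein_to_peptides):
    """Pick proteins greedily until every observed peptide is explained.

    protein_to_peptides maps a protein ID to the set of peptides matched to it.
    Ties are broken alphabetically so the result is deterministic.
    """
    uncovered = set().union(*protein_to_peptides.values())
    chosen = []
    while uncovered:
        # Choose the protein that explains the most still-unexplained peptides.
        best = max(
            sorted(protein_to_peptides),
            key=lambda p: len(protein_to_peptides[p] & uncovered),
        )
        chosen.append(best)
        uncovered -= protein_to_peptides[best]
    return chosen


# Hypothetical search-engine hits: P2 and P4 add nothing beyond P1 and P3.
hits = {
    "P1": {"pepA", "pepB", "pepC"},
    "P2": {"pepB", "pepC"},
    "P3": {"pepD"},
    "P4": {"pepC", "pepD"},
}
minimal_set = greedy_parsimony(hits)
print(minimal_set)
```

The greedy heuristic does not guarantee the smallest possible protein set, but it is a common and simple approximation.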
7.
Nucleic Acids Res ; 38(Database issue): D731-5, 2010 Jan.
Article in English | MEDLINE | ID: mdl-19942688

ABSTRACT

Peptidome is a public repository that archives and freely distributes tandem mass spectrometry peptide and protein identification data generated by the scientific community. Data from all stages of a mass spectrometry experiment are captured, including original mass spectra files, experimental metadata and conclusion-level results. The submission process is facilitated through acceptance of data in commonly used open formats, and all submissions undergo syntactic validation and curation in an effort to uphold data integrity and quality. Peptidome is not restricted to specific organisms, instruments or experiment types; data from any tandem mass spectrometry experiment from any species are accepted. In addition to data storage, web-based interfaces are available to help users query, browse and explore individual peptides, proteins or entire Samples and Studies. Results are integrated and linked with other NCBI resources to ensure dissemination of the information beyond the mass spectroscopy proteomics community. Peptidome is freely accessible at http://www.ncbi.nlm.nih.gov/peptidome.


Subject(s)
Computational Biology/methods , Databases, Genetic , Databases, Nucleic Acid , Databases, Protein , Mass Spectrometry/methods , Proteomics/methods , Computational Biology/trends , Gene Expression Profiling , Humans , Information Storage and Retrieval/methods , Internet , National Library of Medicine (U.S.) , Peptides/chemistry , Protein Structure, Tertiary , Software , United States
9.
Mol Cell Proteomics ; 6(10): 1749-60, 2007 Oct.
Article in English | MEDLINE | ID: mdl-17623647

ABSTRACT

Postsynaptic density protein 95 (PSD-95), a specialized scaffold protein with multiple protein interaction domains, forms the backbone of an extensive postsynaptic protein complex that organizes receptors and signal transduction molecules at the synaptic contact zone. Large, detergent-insoluble PSD-95-based postsynaptic complexes can be affinity-purified from conventional PSD fractions using magnetic beads coated with a PSD-95 antibody. In the present study purified PSD-95 complexes were analyzed by LC/MS/MS. A semiquantitative measure of the relative abundances of proteins in the purified PSD-95 complexes and the parent PSD fraction was estimated based on the cumulative ion current intensities of corresponding peptides. The affinity-purified preparation was largely depleted of presynaptic proteins, spectrin, intermediate filaments, and other contaminants prominent in the parent PSD fraction. We identified 525 of the proteins previously reported in parent PSD fractions, but only 288 of these were detected after affinity purification. We discuss 26 proteins that are major components in the PSD-95 complex based upon abundance ranking and affinity co-purification with PSD-95. This subset represents a minimal list of constituent proteins of the PSD-95 complex and includes, in addition to the specialized scaffolds and N-methyl-d-aspartate (NMDA) receptors, an abundance of alpha-amino-3-hydroxy-5-methyl-4-isoxazolepropionic acid (AMPA) receptors, small G-protein regulators, cell adhesion molecules, and hypothetical proteins. The identification of two Arf regulators, BRAG1 and BRAG2b, as co-purifying components of the complex implies pivotal functions in spine plasticity such as the reorganization of the actin cytoskeleton and insertion and retrieval of proteins to and from the plasma membrane. Another co-purifying protein (Q8BZM2) with two sterile alpha motif domains may represent a novel structural core element of the PSD.


Subject(s)
Nerve Tissue Proteins/analysis , Synapses/chemistry , Animals , Chromatography, Affinity , Electrophoresis, Polyacrylamide Gel , Nerve Tissue Proteins/isolation & purification , Rats , Rats, Sprague-Dawley
10.
Proteomics ; 3(9): 1687-91, 2003 Sep.
Article in English | MEDLINE | ID: mdl-12973726

ABSTRACT

Mass spectrometry data is inherently uncertain. Rather than compare peak heights across samples, a comparison can be made of the relative ordering of the peak heights across samples. Order statistics are used to provide a distance metric between each ordered list of peak heights from the samples. A principal component analysis is performed on the set of distance vectors to highlight the important components.


Subject(s)
Mass Spectrometry/statistics & numerical data , Peptides/analysis , Mass Spectrometry/methods , Principal Component Analysis
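
The rank-based comparison described in this abstract can be sketched as follows. The abstract does not name the distance metric, so this example assumes the Spearman footrule (the sum of absolute rank differences) as a stand-in. The peak heights are invented, and the principal component analysis step is not shown.

```python
def ranks(values):
    """Return the 0-based rank of each value (smallest value has rank 0)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order):
        result[idx] = rank
    return result


def footrule_distance(a, b):
    """Sum of absolute rank differences between two peak-height lists."""
    if len(a) != len(b):
        raise ValueError("samples must cover the same peaks")
    return sum(abs(x - y) for x, y in zip(ranks(a), ranks(b)))


# Hypothetical peak heights for the same four peaks in three samples.
sample_a = [10.0, 50.0, 30.0, 5.0]
sample_b = [20.0, 100.0, 60.0, 10.0]  # same ordering as sample_a, doubled intensity
sample_c = [50.0, 10.0, 5.0, 30.0]    # different ordering

d_ab = footrule_distance(sample_a, sample_b)
d_ac = footrule_distance(sample_a, sample_c)
print(d_ab, d_ac)
```

Because only the ordering matters, a uniform change in intensity between samples (as in `sample_b`) leaves the distance at zero. Collecting each sample's distances to all other samples gives the distance vectors on which a principal component analysis could then be run.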